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Abstract. A dynamic coupled modelling is investigated to take temperature 
into account in the individual energy consumption forecasting. The objective 
is both to avoid the inherent complexity of exhaustive SARIMAX models and 
to take advantage of the usual linear relation between energy consumption and 
temperature for thermosensitive customers. Wc first recall some issues related to 
' individual load curves forecasting. Then, we propose and study the properties of 

a dynamic coupled modelling taking temperature into account as an exogenous 
contribution and its application to the intraday prediction of energy consumption. 
Finally, these theoretical results are illustrated on a real individual load curve. The 
^ ■ authors discuss the relevance of such an approach and anticipate that it could form 

a substantial alternative to the commonly used methods for energy consumption 
■ forecasting of individual customers. 



o 

(N 



< 



C/3 



(N 



1. INTRODUCTION 



The electrical systems have to manage with new challenges: constant increasing 
demand of electricity, including the arrival of new uses such as electric vehicle, in- 
creasing production of renewable energies decentralized, increasing number of energy 
^ ■ market participants including aggregators, with a policy of DMS and CO2 reduc- 

O . tion. In particular, we note an increase in peaks load at critical times, intermittent 

injections of energy at all levels making electrical networks much more difficult to 
Oi manage, the balance supply/demand more difficult to keep and economic interests 

■ much more dispersed. On the other hand, the technological advances are enabling 

O , new opportunities: the development of smart-metering and smart grid. Indeed, there 

is a massive deployment in Europe - 80% of households will be equipped by 2020 - 
and in North America, mainly. In this context, forecasting consumption of very fine 
mesh becomes an important issue at the heart of the system of electric power. In 
. this paper, we focus exclusively our attention on the prediction of end-customer on 

I a short-term horizon. These forecasts are useful for the customers who want to opti- 

mize their bill and make DMS, for network planning and monitoring, for aggregators 
who wish to apply real-time load shifting. Forecasting methods on large aggregate 
of customers exist in abundance but cannot be extrapolated to individual curves be- 
cause of their extremely irregular behavior. Indeed, aggregation has the advantage 
of reducing the noise and provides little chaotic curves in which trend and season- 
ality are easily identifiable. For example, in the case of short-term forecasting, the 
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exponentially weighted methods of Taylor [22] give pretty good results. Harvey and 
Koopman [5T] also developed an unobserved components model with time-varying 
splines to capture the evolution of patterns in hourly electricity loads. Afterwards, 
bayesian methods that rely on Kalman filter and state space models were suggested 
by Martin [26] and Smith [28]. Multiple time series approaches have also emerged 
to model and predict intraday aggregated load curves, one can cite in example the 
heteroskedastic GARCH model of Garcia et al. [17], the seasonal ARFIMA-GARCH 
model of Koopman et al. [23], the robust SARIMA estimates of Chakhchoukh et 
al. [Sj, [S], or the bias reducing iterative algorithm of Cornillon et al. [lU]. Lately, 
Dordonnat et al. [2] have developed a very comprehensive space state approach 
for modelling hourly french national electricity load, taking into account different 
levels of seasonality, calendar events and weather dependence, with one equation for 
each hour. Semiparametric methods and artificial neural networks to model meteo- 
rological effects and seasonal patterns are considered by Cottet and Smith [11], and 
by Liu et al. [25]. Poggi [2^ proposed a nonparametric approach based on kernel 
estimators to make forecasts on half-hourly french load curves and more technical 
studies from Antoniadis et al. [I], [2] based on functional kernel- wavelets can also 
be mentioned. In short, several methods have been implemented of various kinds, 
from nonparametric to parametric models with exogenous covariates through semi- 
parametric approaches, heteroskedasticity, space state models and neural networks, 
each of them providing excellent intraday forecast results on extensively averaged 
curves, such as national load. Let us conclude by refering the reader to the approach 
of Devalue et al. [12] , which is an aggregation of specialized experts combining a set 
of prediction outputs by independent forecasters. 

The issue of individual forecasting is complex and very few literature is available 
on the subject. The main difficulty of the individual load curves is the deep irregu- 
larity resulting from the human behavior. Indeed, we have to deal with phenomena 
that aggregation usually hides, such as high disturbances, unpredictable local be- 
haviors or thresholds during holiday periods. To the best of our knowledge, very 
few studies have been conducted in this area. In their work, Espinoza et al. [16] 
are concerned by the short-term load forecasting from a HV-LV substation, and 
Ghofrani et al. [19] propose to model real-time measurement data from customers' 
smart meters as the sum of a deterministic component and a gaussian noise signal. 
This paper suggests a statistical parametric approach adapted to an individual load 
curve which shows a substantial seasonal pattern and a thermosensitive behavior as 
prerequisites, highly relying on the time series theory. The authors anticipate that 
it could form an alternative to the commonly used methods for energy consumption 
forecasting of individual customers. In Section 2, we introduce a dynamic coupled 
modelling taking temperature into account. We also recall some known results on 
time series analysis, in particular the concepts of stationarity and causality. We 
recall some theoretical backgrounds which will be used as a basis for the empirical 
study on a real curve in the next section. As an exemple of load curve which we 
will investigate. Figure 11.11 displays the energy consumption of a thermosensitive 
customer and the interpolated temperatures measured every 3 hours by the nearest 
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weather station on the same period of 6 months. Section 3 is devoted to the detailed 
study on such a load curve based on time series analysis. The main motivations for 
proposing a coupled modelling are the linear relation between the logarithmic en- 
ergy consumption and the temperature on the one hand, and the seasonal behavior 
of the residuals on the other hand. Figure 11.21 represents the scatter plot between 
temperature and consumption for the same customer, in which one can observe the 
linear relationship. It also displays the residuals from the linear regression on the 
right-hand side. One shall investigate seasonality, stationarity and autocorrelations 
in the residuals from the linear regression, to build a suitable time series modelling 
and propose a forecasting algorithm, according to some criteria that will be speci- 
fied. For that purpose, one shall make an extensive use of the well-known Box and 
Jenkins methodology [3]. A short conclusion is given in Section 4. 




Figure 1.1. Energy consumption of a thermosensitive customer 
(left), temperature during the same period (right). 
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Figure 1.2. Scatter plot between energy consumption and temper- 
ature (left), residuals from the linear regression (right). 

Remark 1.1. In all the sequel, B stands for the backshift operator which operates 
on an element of a given time series to produce the previous element, BX^ = Xt-i- 
The backshift operator is raised to arbitrary integer powers so that B^Xt = Xt^g. 
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The difference operator V, defined as VX^ = {l — B)Xt, also generalises to arbitrary 
integer powers so that V.X^ = (1 - B')Xt and VXt = (1 - ByXf 



Let us start by recalling some usual tools related to time series analysis that we 
shall make repeatedly use of throughout the study. The reader will find more details 
on these results in Chapters 1 and 3 of 

Definition 2.1 (Stationarity) . A time series (Yt) is said to be weakly stationary if, 
for all t G Z, E[Y^^] < oo, E[Yt] = m and, for all s,t e Z, E^YtY^] = E[Yt_sYo]. 

In all the sequel, the term stationarity will always refer to the weak stationarity. 
Let us now focus on the stationary ARMA process and on the concept of causality. 

Definition 2.2 (ARMA). Let (1^) be a stationary time series with zero mean. It is 
said to be an ARMA(p, q) process if, for every t E Z, 



where (et) is a white noise of variance > and the parameters a G and b G M.'^. 
The equation (12.11) can be rewritten in the more compact form 



where the polynomials 

A{z) = 1 — aiz — ... — Qpz'^ and B{z) = 1 + biZ + . . . + bgz'^. 

In the particular case where p = 0, A{z) = 1 and (Yt) is a moving average MA{q) 
process. Likewise, if g = 0, B{z) = 1 and (Yt) is an autoregressive AR(p) process. 

Definition 2.3 (Causality). Let (Yt) be an ARMA(p, q) process for which the polyno- 
mials A and B have no common zeroes. Then, (Yt) is causal if and only if A(z) ^ 
for all 2; G C such that \z\ < 1. 

The causality enables to write the ARMA(p, q) process as an MA(oo) one. It guar- 
antees the existence of an unique stationary solution for the ARMA(p, q) process 
expressed as a linear process associated with (et), by virtue of the following result. 

Proposition 2.1. If A(z) 7^ for all z e C such that \z\ < 1, then the ARMA 
equation A(B)Yt = B(B)et has the unique stationary solution 
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A(B)Yt = B(B)et 
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(2.2) 




fc=0 



and the coefficients (ipk)km are determined by the relation 
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Proof. We refer the reader to Theorem 3.1.1 of |6] for a detailed version of Proposi- 
tion 12.11 and the proof that follows. □ 

One can observe that the stationarity of the solution and the causality of the ARMA 
process will often coincide on the real curves defined on the positive integers that 
will be considered in the following. Indeed, a zero inside the unit circle results in an 
explosive behavior of the process that cannot match with stationarity on N*. We 
now focus our attention on some practical tools related to identification methods for 
the orders of stationary AR(p) and MA(g) processes. 

Definition 2.4 (ACF). Let (Yt) be a stationary time series. The autocorrelation 
function p associated with (Yt) is defined, for all t G Z, as 

= W) 

where the autocovariance function 7(t) = E[yjyo] ~ E[yf]E[Yo]- 

Proposition 2.2. The stationary time series {Yt) with zero mean is a MA{q) process 
such that bg if and only if p{q) ^ and p{t) = for all \t\ > q. 

Denote by {4>t,j), for all t G N* and j G {1, . . . ,t}, the sequence given by = p(l) 
and, for all t > 2, by the Levinson-Durbin recursion. 



1 - k p{k) p{t) - J2 '^t-i, k p{t - k) 



k=l / \ k=l 

where, for all /c G {1, . . . ,t- 1}, (j)t,k = 4>t-i,k - <Pt,t4>t-i,t-k- The sequence {(f)t,t)t(£N* 
may be seen as the correlation between two residuals obtained after regressing Yt+i 
and Yi on the intermediate observations Y2, . . . ,Yt. Formally, for all t G N*, 

(j)t,t = Corr(yt+i - Fs-p{Y2,...,Yt}Yt+i,Yi - Ps-piya,...,^^^!) 

where Psp{Y2,...,Yt} stands for the L^— orthogonal projection operator of any square- 
integrable random variable on the closed subspace generated by F2, • • • , Yt- 

Definition 2.5 (PACE). Let (Yt) be a stationary time series with zero mean. The 
partial autocorrelation function a is defined as a(0) = 1, and, for all t G N*, as 

a{t) = (f)t,t- 

Proposition 2.3. // there exists a sequence (ipk) ^ such that (Yt) has the 

unique expression given by (12. 2 p with ipQ = 1, then the stationary time series (Yt) 
with zero mean is an AR{p) process such that Op ^ if and only if a{p) 7^ and 
a{t) = for all t > p. 

Proof. The proofs of Propositions 12.21 and 12.31 may be found in Chapter 3 of [6]. □ 

These identification techniques will be useful thereafter for orders selection in mod- 
els building. The hypothesis of stationarity will be tested via the commonly used 
Kwiatkowski-Phillips-Schmidt-Shin KPSS test [23] together with the unit root Aug- 
mented Dickey- Fuller ADF test [13] . As for the hypothesis of white noise, it will be 
evaluated through the portmanteau test of Ljung-Box [3], [S]- 
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In all the sequel, we denote by {Ct) the individual energy consumption of a given 
customer, for all 1 < t < T. We also denote by (Ut) the temperature associated 
with (Cf), supposed to be known for all — r + 2<t<T + H where H is the pre- 
diction horizon. In addition, we will use a variance-stabilizing Box-Cox logarithmic 
transformation (Yt) ensuring homoskedasticity, given, for all 1 < t < T, by 

(2.3) Fi = log(a + e'^) 

where /i is a positive parameter to evaluate, implying that Yt = fi in the particular 
case where Ct = 0. This safety precaution is justified by the possible use of relative 
criteria, such as the Mean Absolute Percentage Error. 

The dynamic coupled modelling. The first step of the modelling relies in a 
suitable way to remove the direct influence of the temperature on the consumption. 
As mentioned above, it exists a strong correlation between (Yt) and (Ut). This 
relationship is modeled through the linear regression given, for all 1 < t < T, by 

(2.4) Yt = Co + C{B)Ut + et 

where cq G M is an intercept, C{B) is a polynomial of order r such that, for all z E C, 

r 

C{z) = Y,Ckz'-' 

k=l 

and the unknown vector parameter c G M''^^ is estimated by standard least squares. 
The disturbance terms (et) will be regarded as a seasonal time series. In particular, 
(et) is said to follow a SARIMA(p, d, q) x (P, D, Q), modelling if, for all 1 < t < T, 

(2.5) (1 - BY{1 - B'')''A{B)A,{B)et = B{B)B,{B)Vt, 

according to Definition 9.6.1 of [6], where (Vt) is a white noise of variance cr^ > 0, 
and where the polynomials are defined, for all 2; G C, as 

p p 

A{z) = 1 - XI "'^^^ A(^) = 1 - 5Z 

k=l k=l 
q Q 

B{z) = 1 - 5^ hkz\ Bs{z) = l-Y, 

k=l k=l 

In this modelling, a e W, b e W, a e and /3 G are vector parameters 
estimated by generalized least squares. The differenced process (V^Vf^eJ in (12. 5p 
is a stationary solution of the ARMA causal process, i.e. A{z) 7^ and As{z) 7^ 
for all z G C such that \z\ < 1. 

Definition 2.6 (SARIMAX). In the particular framework of the study, a random pro- 
cess (Yt) will be said to follow a SARIMAX(p, d, q, r) x (P, D, Q)s coupled modelling 
if, for all 1 < t < T, it satisfies 



(2.6) 



Yt = Co + C{B)Ut + et, 

(1 - Bf{l - B-^)^A{B)As{B)et = B{B)B,{B)Vt. 
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The orders p, d, q, r, P, D, Q and s shall be evaluated following a well-known Box 
and Jenkins methodology [3]. Moreover, a straightforward calculation shows that 
(12.61) can be rewritten in the condensed form given, for all 1 < t < T, by 



(2.7) 



1 - Bfil - BTAB)MB) (Yt - C{B)Ut) = B{B)B,{B)Vu 



as soon as d + D > 0, which will be an assumption always verified as we shall explain 
in the next section. Indeed, cq vanishes by a single differentiation of [et). In light of 
foregoing, one can establish the following result, denoting by / the identity matrix 
of order T, Y and U the observation vector of order T and the design matrix of 
order T x (r + 1), respectively given by 



Y 



Y2 



\YtJ 



and U 



/I 

1 



Ut 
Ut-1 



Ut~2 



UT-r+l\ 
Ur-r 



\1 f/i f/o 



r+2 



Theorem 2.1. Assume thatU'U is invertible. Then, the differenced process {V'^V^St) 
where et is given, for all 1 < t < T, by the vector form 

(2.8) e = {I -U{U'Uy^U')Y 

is a stationary solution of the coupled model (12. 6p . 

together with a 
□ 



Proof. Theorem 12.11 is a direct consequence of Proposition 
straightforward least squares calculation. 



Remark 2.1. In the particular case where r = 0, we merely obtain e = Y — Y where 

1 ^ 



Y 



k=l 



and (12. 6p reduces to the usual SARIMA(p, d, q) x (P, D, Q)s modelling on the recen- 
tered load curve. In addition, as soon a.s d + D > 0, the influence of Y vanishes. 

The t— statistic associated with each parameter, exploiting the asymptotic normality 
of the estimates, will provide a significance testing procedure, as a confirmation of 
the criteria minimization strategy. Though, they will not be appropriate in the 
exogenous regression owing to the strong autocorrelation in the residuals, and will 
only be applied to the time series coefficients. 

Application to forecasting. Whatever prediction method one wishes to apply, 
see e.g. Chapters 5 and 9 of [3] or Chapter 5 of [6], the time series analysis of (12. 6p 
provides the predictor of (et) at stage T + 1, denoted by ?t+i- Let ct be the least 
squares estimator of c in (12. 4 p and assume that the order r is known. Then, it 
follows that 



(2.9) 



Yt+i — Co,t + Ck,TUT-k+2 + ^T^ 



1- 



k=l 
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Via the same lines, since (Ut) is supposed to be known for all —r + 2 < t < T + H , 
the predictor at horizon H is given by 

r 

(2.10) Yt+h = co,t + Cfc, rUr-k+H+i + ^t+h- 

k=l 



3. APPLICATION TO FORECASTING ON A LOAD CURVE 

By virtue of Theorem 12. 11 the apphcation of the coupled model (12. 6p to real curves 
merely consists in identifying the seasonality and the orders of differencing ensuring 
the stationarity of the residual sequence from the regression analysis. Moreover, 
from a careful analysis of the ACF and PACF, we will get a first approximation 
of the orders to be considered in the ARMA modelling. We shall first investigate 
seasonality through a Fourier spectrogram, then stationarity of the deseasonalized 
series and autocorrelations via ACF and PACF, and finally the overall randomness 
of successive innovations. Different models will be suggested and compared using 
bayesian criteria on the one hand, and then prediction criteria on the other hand. As 
mentioned above, the KPSS test [21], the ADF test [I3] and the Ljung-Box test [1], 
[5] will be used as statistical procedures for evaluating the hypothesis of stationarity, 
of unit root and of white noise up to a certain lag, respectively. 

From now on, for all 1 < t < T, (Q) is a load curve and (Yt) is the associated 
logarithmic process, given by (12. 3 p with = 5. In addition, (Ut) is the exogenous 
temperature supposed to be known for all —r + 2 < t < T + H and H is the 
prediction horizon. Denote also by (e't) the least squares estimated residual set from 
the regression analysis accordingly given, for all 1 < t < T, by 

r 

(3.1) et = Yt- cq^t - Cfc,T^f-fc+i, 

fc=i 

directly coming from (12. 8p . For a sake of simplicity, one shall take r = 1 without 
loss of generality. Besides, one observes on real curves that numerical results are 
very similar when r increases. Indeed, due to the natural phenomenon it represents, 
temperature Ut at time t is highly correlated to Ut-i and the use of lots of regressors 
to explain Ct in our modelling would often be redundant and generate statistically 
nonsignificant coefficients. 

Seasonality. Let us choose for example T = 730, that is 2 years of consumption. 
We consider the k—th empirical Fourier coefficient of (e't), given by 



2 

1 

t=l 



T 



where fk = 2'Kk/T is the k—th Fourier frequency. Figure [3111 displays the variation 
of y/yk on the Fourier frequency spectrum of (e^) on the left-hand side and the ones 
of (Vi2?t) and (V24?j) on the right-hand side, with unexploitable low frequencies 
truncated. 
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Figure 3.1. Fourier spectrograms of residuals (left) and seasonally 
differenced residuals of period 12 and 24 (right). 

Figure 13.11 shows that the estimated residual set (e't) has a seasonality and the 
abscissa of the main peak indicates that the pattern repeats itself 730 times on 2 
years, that is daily. The second peak also suggests a seasonality of 12 hours. On 
the right side, one can see that (Vi2?t) still has a periodicity whereas (V24?t) is 
quasi-aperiodic. This is the reason why one shall choose s = 24 in the SARIMA 
modelling, and that also leads us to choose D = 1 in (12. 6p . 

Stationarity. The KPSS and ADF statistical procedures both suggest that, on 6 
months of consumption, (e't) is not stationary whereas (VeJ, (V24?t) and (W2i£'t) 
are stationary. As a consequence, (ej) is difference- stationary and the differenced 
series can all be solutions of a causal ARMA modelling, which leads to ARIMA 
models with d = 1 and SARIMA models with d = and D = 1 oi d = 1 and D = 1. 

Autocorrelations. On the ACF and PACF of (et), one can clearly observe the 
daily periodicity of the series, as it appears on Figure 13.21 




Figure 3.2. ACF (left) and PACF (right) of the residuals (et). 
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In addition, the sample ACF of (V24?f) on Figure 13.31 below shows either an ex- 
ponential decay or a mixture with damped sine wave, while the sample PACE has 
a relatively large spike at lag 1 and can reasonably be considered as nonsignif- 
icant afterwards, with uncertainty up to lag 5. One can also detect a pattern 
around lag 24 on the ACF. This behavior suggests an AR(p) modelling with a 
seasonal moving average autoregression on the seasonally differenced series, that is 
a SARIMA(p, 0, 0) x (0, 1, Q)24 model with p < 5, and Q = 1 on (e*). 



I j 1 1 II I II 1 1 I j l l Mtg»;.gt!;. j^jg.- j^ . ;^^ -^WllWl 



Figure 3.3. ACF (left) and PACF (right) of the seasonally differ- 
enced residuals of period 24 (V24?t). 



Finally, Figure [3^ displays the sample ACF and PACF of (VV24?t). One can observe 
that the PACF tails off exponentially from lag 1 and that the ACF cuts off after 
lag 2, with small seasonal contributions. As a consequence, the series is likely to be 
generated by a SARIMA(p, 1, g) x (0, 1, (5)24 process with p = 1, q = 2 and Q = 1. 
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Figure 3.4. ACF (left) and PACF (right) of the doubly differenced 
residuals of period 24 (VV24£t). 



Modelling. This identification methodology seems quite rough, and one shall make 
the parameters vary in their neighborhood to determine the optimal modelling. 
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Table 13.11 gives the bayesian criteria associated with a set of SARIMAX models 
fitted on 6 months of consumption. AIC, SBC, LL and WN respectively stand 
for Akaike Information Criterion, Schwarz Bayesian Criterion, Log-Likelihood and 
White Noise. Let us recall that 

AIC = -21og£ + 2A; and SBC = -2 log/: + A; log T 

where C is the model likelihood and k is the number of parameters. In addition, VAR 
is the estimated variance of (Vt). The Ljung-Box portmanteau test is used to evaluate 
the hypothesis of white noise on the fitted innovations, considering arbitrarily that 
(Vt) is a white noise if it has nonsignificant autocorrelations up to lag 3. 
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Table 3.1. Bayesian criteria associated with a set of SARIMAX 
models on 6 months of consumption. 

Estimations come from an optimized mixing of conditional sum-of-squares and max- 
imum likelihood [7], [15], [18], [20]. In order to keep this section brief, only the most 
representative results are summarized in the table above even if more models have 
been evaluated. In conclusion, on the basis of the bayesian criteria, the most ade- 
quate modelling for the given load curve is a SARIMAX(3, 0, 2, 2) x (0, 1, 1)24 whose 
explicit expression is as follows, for all 28 < t < T = 4380, 

{Ft = Co + ciUt + C2Ut-i + St, 
St = £t-24 + ai{et-i - £:t-25) + a2{et-2 - £t-26) + a^ist-s - et-27) 
+ {Vt - hVt-l - h2Vt-2) - (3l{Vt-2A - hVt.2, - b2Vt.26), 

in which the estimates at stage T = 4380 are approximately given by 

Co = 7.9871, ci = 0.0166, Ca = -0.0420, = 0.4776, 03 = 0.9030, 

03 = -0.4305, bi = 0.0801, % = -0.8524, ^1 = -0.8125, = 0.0522, 
and the t— statistics of the time series coefficients, justifying their significance, by 
ta, = 12.58, ta, = 26.13, = 15.53, %^ = 2.50, = 28.17, t^^ = 75.42. 

Moreover, one can easily check that the estimation of the autoregressive polynomial 
A{z) = 1 — aiz — 02^^ — a^z^ is causal, for all 2 G C. The fitted values (Ct) are 
obtained via (12. 3p . that is, for all 28 < t < T, 

(3.3) Ct = e^' - 

with = 5. On Figures 13.51 and 13.61 the fitted values (It) and (Ct) from model 
(13. 2 p and (13. 3p are represented over the logarithmic load curve (Yf) and the real 
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load curve (Ct), respectively, with a zoom. The temperature (Ut) during the same 
period is represented on Figure 1X71 One can see that, except for some unpredictable 
local behaviors related to the individual nature of the curve, there is a pretty good 
adequation between modeled and real values. 




Load Curve 




4240 4260 4280 4300 4320 4340 

Hr 



Figure 3.6. SARIMAX(3, 0, 2, 2) x (0,1,1)24 modelling on the real 
curve in red over the observed values in blue. 



Forecasting. Our goal is now to propose an intraday forecasting methodology for 
the electrical consumption of individual customers. Let us start by introducing two 
criteria that will help us to select the most suitable forecasting model. Denote by 
Ct+1, . . . , Ct+nh the values of consecutive predictions at horizon H from time T. 
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Figure 3.7. Temperature measured on the same period by the near- 
est weather station. 



Then, the absolute criterion Ca and the relative criterion Cr are defined as follows, 



1 



NH 



' NH 



NH 



T+k 



G 



T+k 



and Cr — I Cr+k | jC'r+A: — Cx+k 

k=l \k=l / k=l 

Following the same lines as in the identification step, one has to make the parame- 
ters vary in their neighborhood to determine the most powerful forecasting model, 
considering the SARIMAX(3, 0, 2, 2) x (0, 1, 1)24 modelling as a basis. A Kalman 
filtering finite-history prediction method [13], [20], [22] is used to produce {YT+k) 
from the modelling, for all 1 < /c < NH, and the forecasts {Cx+k) are obtained by 



(3.4) 
with /i 



T+k 



5. 



Remark 3.1. It is important to note that Ct+i, ■ ■ ■ , Ct+h is not a sequence of predic- 
tions at horizon H but a sequence in which only the last component is a prediction at 
horizon H. By misuse of language, one shall consider in the sequel that a sequence 
of predictions at horizon H corresponds to H successive predictions without addi- 
tional meantime information. By extension, a sequence of predictions at horizon 
H needs A^ estimations of the parameters. 

Our experiments are based on A^ = 14 days of daily forecasting, i.e. H = 24, the 
coefficients are evaluated on 3 months of data, that is T = 2190, and the numerical 
results are summarized in the Table 13.21 below. 

The parsimony in the time series analysis is a central issue in forecasting appli- 
cations, and it is not surprising that the models minimizing Ca and Cr are not 
the same as those minimizing the bayesian criteria, and tend to reject uncertainty 
coming from overparametrization. Moreover, by selecting an optimal sliding win- 
dow in the modelling, one is able to slightly improve our results. For example, the 
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253.8 
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SARIMAX 
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1 


24 


254.1 


0.2403 



Table 3.2. Prediction criteria associated with a set of daily SARI- 
MAX forecasts on 3 months of consumption. 



SARIMAX(1, 0, 0, 2) x (0, 1, 1)24 model provides Ca = 231.0 and Cr = 0.2185 in the 
particular case where M = 1 month, that is T = 730. On Figure [331 we investigate 
the influence of the size of the sliding window M together with the one of the ex- 
ogenous regression dimension r on the relative criterion Cr for the latter modelling 
and the same experiment. This enables us to select the most powerful forecasting 
model for this particular curve. 




Figure 3.8. Influence of r and M on Cr calculated from 14 daily 
forecasts from the SARIMAX(1, 0, 0, r) x (0, 1, 1)24 modelling. 



Accordingly, one shall consider the SARIMAX(1, 0, 0, 2) x (0, 1, 1)24 modelling with 
M = 0.75 month, even if one can see that r is not playing a substantial role as 
soon as it is greater than 2 for a reason of strong correlation of the exogenous 
phenomenon, already mentioned above. One can also observe that prediction results 
can be improved when the parameters are evaluated on rather small amounts of 
data. This can once again be explained by the nature of the curve and the underlying 
human behavior whose consumption is highly influenced by local circumstances such 
as weather, holiday period, etc. Whereas too few data are not sufficient to take into 
account the seasonality of period 24 and properly estimate the very significant Pi 
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parameter, conversely, results tend to stabilize when M increases disproportionately. 
The explicit expression of the predictive model is as follows, for all 26 < t < T = 548, 




Yt = co + ciUt + C2Ut-i + Su 

£t = £t-2i + a'i{et-i - et-25) + Vt- f3iVt-24., 



in which the estimates at stage T = 548 are approximately given by 
Co = 7.2494, ci = 0.0497, Ca = -0.0629, = 0.3540, A = -0.7086, = 0.0708, 
and the t— statistics of the time series coefficients, justifying their significance, by 

fe, = 8.66, t^^ = 21.02. 

The estimation of the autoregressive polynomial A{z) = 1 — aiz is actually causal, 
for all 2 G C. On Figures l3^ and 13. 10[ we display an example of 7 daily predictions 
from the latter model and a sliding window of 0.75 month of consumption, for the 
logarithmic curve (Yt) as well as for the load curve (Ct). It also contains the 95% and 
90% prediction confidence intervals, rather large owing to the horizon of prediction. 




780 BOO 820 840 860 880 



Figure 3.9. SARIMAX(1, 0, 0, 2) x (0, 1, 1)24 daily predictions on the 
logarithmic curve in magenta over the observed values in blue. 



Note that Yt+i, . . . , Yt+nh only differs of 3% from Ir+i, • • • , Yt+nh when we con- 
sider the whole 14 daily predictions, that is NH = 336. Once again, one can notice 
that the coupled dynamic model provides very interesting results of prediction as 
soon as it is correctly specified, in spite of the noise on the load curve due to its 
individual nature. 
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4. CONCLUSION 

To conclude, we would like to draw the significance of the exogenous covariates to 
the reader's attention. Indeed, let us notice that in some cases, the empirical study 
suggests to select r = 0, meaning no temperature influence despite the manifest 
linear relation between the latter and the consumption. The authors interpret this 
observation by the fact that seasonality and local circumstances totally prevail on 
the effect of temperature and that all information has already been recovered by the 
deep study of the signal, resulting in very few significant coefficients and equivalent 
forecasts for r = 0, 1, ... In addition, one should not overlook the possible irrelevance 
of the temperature measured by the weather station related to a customer with- 
out other criterion than geolocation, especially when altitude is concerned, coastal 
residence, cloud covering, or more generally when substantial differences may be 
observed at the same time between the weather station and the customer's home. 
Also, we should not forget that the exogenous inputs assumed to be known during 
the prediction period of the time series are nothing else than predictions themselves, 
with all attendant uncertainty. In addition to the required seasonality, the relevance 
of the exogenous measures is a central issue for this approach to be applied to a 
load curve with profit. Nevertheless, and despite many irregularities due to the 
individual nature of the curves, this study shows that some very interesting results 
of daily forecasts may be obtained under certain conditions already described, and 
above all a careful study of each curve. Finally, this intraday forecasting approach 
has been conducted on a whole set of individual customers from EDF, leading to 
the same satisfactory conclusions. 
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